Abstract

Few  plan  recognition  algorithms  are  designed  to  tolerate 
input  errors.  We  describe  a  case-based  plan  recognition 
algorithm  (SET-PR)  that  is  robust  to  two  input  error  types: 
missing and noisy actions. We extend our earlier work on
SET-PR with a more extensive evaluation that tests the utility
of its novel action-sequence representation for plans and
investigates other design decisions (e.g., choice of similarity
metric). We found that SET-PR outperformed a baseline
algorithm in its ability to tolerate input errors, and that
storing and leveraging state information in its plan
representation substantially increases its performance.

1.  Introduction 

We  are  developing  an  intelligent  agent  to  control  a  robot  in 
joint  human-robot  team  missions.  This  robot  perceives  the 
actions  of  its  human  teammates,  recognizes  their  plans  and 
goals,  and  then  selects  its  actions  accordingly.  Our  plan 
recognizer  must  operate  on  action  information  perceived  by 
lower-level  perception  that  is  prone  to  errors  (i.e., 
mislabeled  and/or  missing  actions  in  the  observed  action 
sequences).  Thus,  error  tolerance  is  a  key  design  concern  for 
plan  recognition. 

We  describe  the  Single-agent  Error-Tolerant  Plan 
Recognizer  (SET-PR),  a  case-based  algorithm.  Plan 
recognition  algorithms  typically  employ  a  model  or  library 
of  plans  to  recognize  an  ongoing  plan  from  observed  action 
sequences. SET-PR's plan representation (action-sequence
graphs) encodes (1) knowledge about actions performed by
an observed agent, as is normally done, and (2) the
subsequent state. That is, plans in SET-PR's plan library
contain action-state sequences rather than only action
sequences. To process these, SET-PR performs graph
matching  to  retrieve  candidate  plans,  and  thus  must  compute 
similarity  efficiently.  Degree  sequence  similarity  metrics 
(e.g.,  Johnson,  1985;  Bunke  and  Shearer  1998;  Wallis  et  al. 
2001)  can  be  used  for  this  task,  but  it  is  not  clear  which  is 
preferable. 

In  §2  and  §3,  we  describe  related  work  and  SET-PR, 
respectively.  We  introduced  SET-PR  in  (Vattam  et  al.  2014) 
and  reported  its  performance  at  varying  levels  of  input  error. 
In §4 we extend our empirical study by comparing SET-PR's
ability to tolerate input errors against a baseline, and
studying how its plan representation and choice of similarity
function influence its ability to tolerate errors. We found
support for our hypotheses that SET-PR's action-sequence
graph representation for plans and the inclusion of state
information in these representations increase plan
recognition performance in the presence of input errors.
Finally,  we  discuss  these  results  in  §5  and  provide 
concluding  remarks  in  §6. 

2.  Related  Work 

Several  plan  recognition  algorithms  (Sukthankar  et  al.  2014) 
have  used  consistency-based  (e.g.,  Hong  2001;  Kautz  and 
Allen,  1986;  Lesh  and  Etzioni  1996)  or  probabilistic  (e.g., 
Bui,  2003;  Charniak  and  Goldman,  1991;  Goldman  et  al. 
1999;  Pynadath  and  Wellman  2000)  approaches.  SET-PR 
exemplifies  a  less-studied  third  approach,  namely  case- 
based  plan  recognition  (CBPR)  (Cox  and  Kerkez  2006; 
Tecuci  and  Porter  2009).  Some  CBPR  algorithms  can  work 
with  incomplete  plan  libraries,  incrementally  learn  plans,  or 
respond  to  novel  inputs  outside  the  scope  of  their  plan 
library  using  plan  adaptation  techniques.  However,  to  our 
knowledge  none  have  been  designed  for  error-prone  inputs, 
which  is  our  focus. 

Cox  and  Kerkez  (2006)  proposed  a  novel  representation 
for  storing  and  organizing  plans  in  a  plan  library,  based  on 
action-state  pairs  and  abstract  states.  It  counts  the  number  of 
instances of each type of generalized state predicate. SET-PR
uses a similar representation, but stores and processes
plans in an action-sequence graph. As a result, our similarity
metrics also operate on graphs. Our encoding was inspired
by planning encoding graphs (Serina 2010). Although there
are syntactic similarities between these two types of graphs,
important semantic differences exist; Serina's graphs
encode a planning problem while ours encode a solution
(i.e., a grounded plan).

Recently,  Maynord  et  al.  (2015)  integrated  SET-PR  with 
hierarchical  clustering  techniques  to  increase  its  retrieval 
speed.  Sanchez-Ruiz  and  Ontanon  (2014)  instead  use  Least 
Common Subsumer (LCS) Trees for this purpose. In this
paper, we focus on SET-PR's ability to tolerate input errors
rather than methods for increasing its retrieval speed.


3.  SET-PR 

When  our  agent  receives  a  set  of  observations,  it  invokes 
SET-PR  to  obtain  a  hypothesized  plan  for  the  observed 
agents.  SET-PR  is  given  a  plan  library  C  (i.e.,  a  set  of  cases), 
where  a  case  is  a  tuple  c  =  (n0,  g0),  n0  is  a  (grounded)  plan, 
and  g0  is  a  goal  that  is  satisfied  by  7r0’s  execution. 

Each plan is represented as an action-state sequence
$\mathbf{s} = \langle (a_0, s_0), \ldots, (a_n, s_n) \rangle$, where each action $a_i$ is a
ground instance of an operator in the planning domain, and $s_i$ is
the state obtained by executing $a_i$ in $s_{i-1}$. We represent an
action $a$ in $(a, s) \in \mathbf{s}$ as a ground predication
$p = p(o_1{:}t_1, \ldots, o_n{:}t_n)$, where $p \in P$ (a finite set of predicate
symbols), $o_i \in O$ (a finite set of typed constants
representing objects), and $t_i$ is an instance of $o_i$ (e.g.,
stack(block:A, block:B), on(block:A, block:B)). A state $s$ in
$(a, s) \in \mathbf{s}$ is a set of facts $\{p_1, p_2, \ldots\}$, where each $p_i$ is
a predication.

Inputs  to  SET-PR  are  also  represented  as  action-state 
sequences. 
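
As a concrete illustration of this representation, the following Python sketch models predications and action-state pairs. The class and field names (`Predication`, `ActionState`, `args`) are hypothetical and chosen only for this example; they are not taken from SET-PR's implementation.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional, Tuple


@dataclass(frozen=True)
class Predication:
    """A ground predication such as stack(block:A, block:B)."""
    symbol: str                        # predicate symbol p
    args: Tuple[Tuple[str, str], ...]  # argument pairs as written, e.g. ("block", "A")

    def __str__(self) -> str:
        inner = ", ".join(f"{left}:{right}" for left, right in self.args)
        return f"{self.symbol}({inner})"


@dataclass(frozen=True)
class ActionState:
    """One (a, s) pair: an action and the state obtained by executing it."""
    action: Optional[Predication]      # None stands for a 'null' initial action
    state: FrozenSet[Predication]      # the state is a set of facts


# An action-state sequence is an ordered list of (a, s) pairs.
ActionStateSequence = List[ActionState]

stack_ab = Predication("stack", (("block", "A"), ("block", "B")))
on_ab = Predication("on", (("block", "A"), ("block", "B")))
sequence: ActionStateSequence = [ActionState(stack_ab, frozenset({on_ab}))]
```

Frozen dataclasses are used so that facts are hashable and can be stored in a set, mirroring the definition of a state as a set of predications.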


3.1  Action-Sequence  Graphs 

SET-PR uses action-sequence graphs to represent action-state
sequences. A labeled directed graph $G$ is a 3-tuple
$G = (V, E, \lambda)$, where $V$ is a set of vertices, $E \subseteq V \times V$ is a set of
edges, and $\lambda: V \cup E \rightarrow \wp_s(L)$ assigns labels to vertices and
edges. Here, an edge $e = [v, u] \in E$ is directed from $v$ to $u$,
where $v$ is the edge's source node and $u$ is the target node;
$L$ is a finite set of symbolic labels; and $\wp_s(L)$, a set of all the
multisets on $L$, permits multiple non-unique labels for a node
or an edge (for properties of $\wp_s(L)$ please see Serina
(2010)).

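The union $G_1 \cup G_2$ of two graphs $G_1 = (V_1, E_1, \lambda_1)$ and
$G_2 = (V_2, E_2, \lambda_2)$ is the graph $G = (V, E, \lambda)$, where
$V = V_1 \cup V_2$, $E = E_1 \cup E_2$, and

$$\lambda(x) = \begin{cases}
\lambda_1(x), & \text{if } x \in (V_1 \setminus V_2) \lor x \in (E_1 \setminus E_2) \\
\lambda_2(x), & \text{if } x \in (V_2 \setminus V_1) \lor x \in (E_2 \setminus E_1) \\
\lambda_1(x) \cup \lambda_2(x), & \text{otherwise}
\end{cases}$$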
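
The union can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: a graph is reduced to a single dictionary that maps each vertex (a string) or edge (a tuple) to its multiset of labels, multisets are modeled with `collections.Counter`, and the multiset union is taken as the maximum multiplicity (`|`). If a multiset sum is intended instead, `+` would replace `|`. The name `union_labels` is hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable

# Maps each vertex or edge to its multiset of labels.
Labels = Dict[Hashable, Counter]


def union_labels(l1: Labels, l2: Labels) -> Labels:
    """Union of two labeled graphs, following the three-case definition."""
    merged: Labels = {}
    for x in set(l1) | set(l2):
        if x in l1 and x in l2:
            # Shared vertex or edge: combine both label multisets
            # (assumption: union keeps the maximum multiplicity).
            merged[x] = l1[x] | l2[x]
        elif x in l1:
            merged[x] = Counter(l1[x])   # only in the first graph
        else:
            merged[x] = Counter(l2[x])   # only in the second graph
    return merged


g1 = {"A": Counter(["block"]), ("A", "B"): Counter(["e1"])}
g2 = {"A": Counter(["block"]), "C": Counter(["block"])}
g = union_labels(g1, g2)
```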

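Definition: Given ground atom $p$ representing an action
$a$ or a fact of state $s$ in the $k$th action-state pair $(a, s)_k \in \mathbf{s}$,
a predicate encoding graph is a labeled directed graph
$\mathcal{E}_p(p) = (V_p, E_p, \lambda_p)$ where:

• $V_p = \begin{cases}
\{A_{k_p}, o_1, \ldots, o_n\}, & \text{if } p \text{ is an action} \\
\{S_{k_p}, o_1, \ldots, o_n\}, & \text{if } p \text{ is a state fact}
\end{cases}$

• $E_p = \begin{cases}
[A_{k_p}, o_1] \cup \bigcup_{i=1,n-1;\, j=i+1,n} [o_i, o_j], & \text{if } p \text{ is an action} \\
[S_{k_p}, o_1] \cup \bigcup_{i=1,n-1;\, j=i+1,n} [o_i, o_j], & \text{if } p \text{ is a state fact}
\end{cases}$

• $\lambda_p(A_{k_p}) = \{A_{k_p}\}$; $\lambda_p(S_{k_p}) = \{S_{k_p}\}$; $\lambda_p(o_i) = \{t_i\}$
for $i = 1, \ldots, n$

• $\lambda_p([A_{k_p}, o_1]) = \{A_{k_p}^{0,1}\}$; $\lambda_p([S_{k_p}, o_1]) = \{S_{k_p}^{0,1}\}$

• $\forall [o_i, o_j] \in E_p$,
$\lambda_p([o_i, o_j]) = \begin{cases}
\{A_{k_p}^{i,j}\}, & \text{if } p \text{ is an action} \\
\{S_{k_p}^{i,j}\}, & \text{if } p \text{ is a state fact}
\end{cases}$
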

As an interpretation of this definition, suppose we have a
predication $p = p(o_1{:}t_1, \ldots, o_n{:}t_n)$. Depending on whether
$p$ represents an action or a state fact, the first node of the
predicate encoding graph $\mathcal{E}_p(p)$ is either $A_{k_p}$ or $S_{k_p}$ (labeled
$\{A_{k_p}\}$ or $\{S_{k_p}\}$). Suppose it is an action predicate. $A_{k_p}$ is
then connected to the second node of this graph, the object
node $o_1$ (labeled $\{t_1\}$), through the edge $[A_{k_p}, o_1]$ (labeled
$\{A_{k_p}^{0,1}\}$). Next, $o_1$ is connected to the third node $o_2$ (labeled
$\{t_2\}$) through the edge $[o_1, o_2]$ (labeled $\{A_{k_p}^{1,2}\}$), then to the
fourth node $o_3$ (labeled $\{t_3\}$) through the edge $[o_1, o_3]$
(labeled $\{A_{k_p}^{1,3}\}$), and so on. Suppose also the third node $o_2$
is connected to $o_3$ through $A_{k_p}^{2,3}$, to $o_4$ through $A_{k_p}^{2,4}$, with
appropriate labels, and so on.
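
The construction just described can be sketched as follows. This is an illustrative reading of the definition, not the authors' code: the function name, the string encoding of labels (e.g., `A1_stack^0,1`), and the use of a single label per node or edge instead of a multiset are all simplifying assumptions. The example also assumes that A and B are the object nodes and that each is labeled with the type block.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def predicate_encoding_graph(kind: str, k: int, symbol: str,
                             args: List[Tuple[str, str]]):
    """Build (vertex_labels, edge_labels) for one ground predication.

    kind is "A" for an action or "S" for a state fact, k is the index of
    the action-state pair, and args lists (object_node, node_label) pairs.
    """
    head = f"{kind}{k}_{symbol}"              # first node, labeled with itself
    vertex_labels: Dict[str, str] = {head: head}
    for obj, label in args:
        vertex_labels[obj] = label            # object node o_i labeled {t_i}

    objs = [obj for obj, _ in args]
    edge_labels: Dict[Tuple[str, str], str] = {}
    if objs:
        # Edge from the head node to the first object node.
        edge_labels[(head, objs[0])] = f"{head}^0,1"
    # One edge [o_i, o_j] for every pair with i < j.
    for (i, oi), (j, oj) in combinations(enumerate(objs, start=1), 2):
        edge_labels[(oi, oj)] = f"{head}^{i},{j}"
    return vertex_labels, edge_labels


vertices, edges = predicate_encoding_graph(
    "A", 1, "stack", [("A", "block"), ("B", "block")])
```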


Definition: An action-sequence graph of an action-state
sequence $\mathbf{s}$ is a labeled directed graph
$\mathcal{E}_{\mathbf{s}} = \bigcup_{(a,s) \in \mathbf{s}} \left( \mathcal{E}(a) \cup \bigcup_{p \in s} \mathcal{E}(p) \right)$, a union of the predicate
encoding graphs of the actions and state facts in $\mathbf{s}$.


Space constraints prevent providing more detail. Please
see (Vattam et al. 2014) for examples of action-sequence
graphs and their construction from action-state sequences.


3.2  Case  Retrieval 

SET-PR matches an input action-sequence graph $\mathbf{s}^{target}$
with plans in the cases of $C$. The case $c = (\pi_0, g_0)$ whose
plan $c.\pi_0$ is most similar is retrieved as the recognized plan,
and $c.g_0$ is the recognized goal.
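
Retrieval therefore amounts to a nearest-neighbor search over the plan library. The sketch below assumes a library of (plan, goal) pairs and an arbitrary similarity function passed in by the caller; the name `retrieve` and the toy set-overlap similarity are hypothetical. Ties are broken at random, consistent with the behavior reported in the results of this paper.

```python
import random
from typing import Callable, List, Tuple, TypeVar

G = TypeVar("G")  # whatever type represents a plan graph


def retrieve(target: G,
             library: List[Tuple[G, str]],
             sim: Callable[[G, G], float],
             rng=random) -> Tuple[G, str]:
    """Return the (plan, goal) case whose plan is most similar to target."""
    scored = [(sim(target, plan), (plan, goal)) for plan, goal in library]
    best = max(score for score, _ in scored)
    # Several cases may share the best score; pick one of them at random.
    ties = [case for score, case in scored if score == best]
    return rng.choice(ties)


# Toy usage: plans are sets of symbols and similarity is overlap size.
toy_library = [({"a", "b"}, "goal-1"), ({"c"}, "goal-2")]
plan, goal = retrieve({"a"}, toy_library, lambda x, y: len(x & y))
```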

Graphs can be matched by computing their maximum common
subgraph (MCS). However, computing the MCS between two or more
graphs is NP-Complete, restricting applicability to only
small plan recognition problems. As an alternative, many
approximate graph similarity measures exist. One class of
such similarity metrics, based on graph degree sequences,
has been used successfully to match chemical structures
(Raymond and Willett 2002).

Below, we describe four degree sequence similarity
metrics that we will test in SET-PR. These metrics, denoted
as $sim_{str}$, compute plan similarity based on the approximate
structural similarity of their graph representations.

Let $G_1$ and $G_2$ be the two action-sequence graphs being
compared. First, the set of vertices in each graph is divided
into $l$ partitions by label type, and then sorted in a
non-increasing total order by degree¹. Let $L_1^i$ and $L_2^i$ denote the
sorted degree sequences of a partition $i$ in the
action-sequence graphs $G_1$ and $G_2$, respectively. An upper bound
on the number of vertices $V(G_1, G_2)$ and edges $E(G_1, G_2)$ of
the MCS of these two graphs can then be computed as:


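$$|mcs(G_1, G_2)| = V(G_1, G_2) + E(G_1, G_2), \text{ where}$$

$$V(G_1, G_2) = \sum_{i=1}^{l} \min\left(|L_1^i|, |L_2^i|\right)$$

$$E(G_1, G_2) = \sum_{i=1}^{l} \sum_{j=1}^{\min(|L_1^i|, |L_2^i|)} \min\left(|E(v_1^{i,j})|, |E(v_2^{i,j})|\right)$$

where $v_1^{i,j}$ denotes the $j$th vertex of the $L_1^i$ sorted degree
sequence, and $E(v_1^{i,j})$ denotes the set of edges connected to
vertex $v_1^{i,j}$.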
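
The bound above can be sketched in Python as follows. The sketch assumes each vertex carries a single label that determines its partition, and the function names are hypothetical. Because every edge contributes to the degree of both of its endpoints, some formulations of this bound halve the edge term; the sketch follows the bound as described here and does not.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[Hashable, Hashable]


def degree_sequences(vertex_labels: Dict[Hashable, str],
                     edges: List[Edge]) -> Dict[str, List[int]]:
    """Partition vertices by label and sort each partition's degrees,
    largest first."""
    degree = {v: 0 for v in vertex_labels}
    for v, u in edges:
        degree[v] += 1
        degree[u] += 1
    partitions: Dict[str, List[int]] = defaultdict(list)
    for v, label in vertex_labels.items():
        partitions[label].append(degree[v])
    return {label: sorted(ds, reverse=True) for label, ds in partitions.items()}


def mcs_upper_bound(seq1: Dict[str, List[int]],
                    seq2: Dict[str, List[int]]) -> int:
    """Upper bound on |mcs(G1, G2)| from two sets of sorted degree sequences."""
    vertex_bound = 0
    edge_bound = 0
    for label in seq1.keys() & seq2.keys():
        # zip stops at the shorter sequence, i.e. min(|L1|, |L2|) pairs.
        pairs = list(zip(seq1[label], seq2[label]))
        vertex_bound += len(pairs)
        edge_bound += sum(min(d1, d2) for d1, d2 in pairs)
    return vertex_bound + edge_bound


s1 = degree_sequences({"a": "x", "b": "x", "c": "y"}, [("a", "b"), ("b", "c")])
s2 = degree_sequences({"p": "x", "q": "y"}, [("p", "q")])
bound = mcs_upper_bound(s1, s2)
```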

We consider the following four similarity metrics, which
are variations on the above properties.

• J Johnson (Johnson 1985):

$$sim_{str}(G_1, G_2) = \frac{|mcs(G_1, G_2)|^2}{|G_1| \cdot |G_2|}$$

• B Bunke (Bunke and Shearer 1998):

$$sim_{str}(G_1, G_2) = \frac{|mcs(G_1, G_2)|}{\max(|G_1|, |G_2|)}$$

• W Wallis (Wallis et al. 2001):

$$sim_{str}(G_1, G_2) = \frac{|mcs(G_1, G_2)|}{|G_1| + |G_2| - |mcs(G_1, G_2)|}$$

• S Simpson (Ellis et al. 1993):

$$sim_{str}(G_1, G_2) = \frac{|mcs(G_1, G_2)|}{\min(|G_1|, |G_2|)}$$

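
For reference, the four metrics can be written as small Python functions. This sketch assumes that the size of the MCS and the sizes of both graphs are already available as numbers (for example, vertex count plus edge count, which is an assumption here), and it writes the Wallis metric in its usual normalized form so that identical graphs score 1. The function names are hypothetical.

```python
def johnson(mcs: float, g1: float, g2: float) -> float:
    """J: squared MCS size over the product of the graph sizes."""
    return mcs ** 2 / (g1 * g2)


def bunke(mcs: float, g1: float, g2: float) -> float:
    """B: MCS size over the size of the larger graph."""
    return mcs / max(g1, g2)


def wallis(mcs: float, g1: float, g2: float) -> float:
    """W: MCS size over the size of the graph union."""
    return mcs / (g1 + g2 - mcs)


def simpson(mcs: float, g1: float, g2: float) -> float:
    """S: MCS size over the size of the smaller graph."""
    return mcs / min(g1, g2)


# Example: an MCS of size 4 between graphs of size 5 and 8.
scores = {f.__name__: f(4, 5, 8) for f in (johnson, bunke, wallis, simpson)}
```

Note that Simpson normalizes by the smaller graph, so it can stay high when one graph is much smaller than the other, whereas the other three penalize size differences.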


Two plans that are similar in structure can differ
drastically in semantics. For instance, a plan to travel to a
grocery store to buy milk might coincidentally be
structurally similar to a plan to travel to the airport to receive
a visitor. To mitigate this issue, our final similarity metric
is a weighted combination of structural similarity and
semantic similarity, the latter denoted as $sim_{obj}$:

$$sim(G_1, G_2) = \alpha \, sim_{str}(G_1, G_2) + (1 - \alpha) \, sim_{obj}(G_1, G_2),$$

where $sim_{obj}(G_1, G_2) = \frac{|O_{G_1} \cap O_{G_2}|}{|O_{G_1} \cup O_{G_2}|}$ is the Jaccard coefficient of the
sets of (grounded) objects in $G_1$ and $G_2$, and $\alpha$ ($0 \le \alpha \le 1$)
governs the weights associated with $sim_{str}$ and $sim_{obj}$.
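
A minimal sketch of the combined metric follows. It assumes the structural score has already been computed and that the grounded objects of each graph are available as Python sets; returning 1.0 for two empty object sets is an assumption made only to avoid division by zero. The function names are hypothetical.

```python
from typing import Set


def jaccard(objs1: Set[str], objs2: Set[str]) -> float:
    """Jaccard coefficient of two object sets."""
    union = objs1 | objs2
    if not union:
        return 1.0  # assumption: two empty sets count as identical
    return len(objs1 & objs2) / len(union)


def combined_similarity(sim_str: float, objs1: Set[str], objs2: Set[str],
                        alpha: float = 0.5) -> float:
    """Weighted sum of structural and semantic (object-based) similarity."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * sim_str + (1.0 - alpha) * jaccard(objs1, objs2)


# Two plans with structural score 0.4 that share one of three objects.
score = combined_similarity(0.4, {"A", "B"}, {"B", "C"}, alpha=0.5)
```

Setting `alpha` to 0 yields purely semantic similarity and setting it to 1 yields purely structural similarity.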


4.  Empirical  Study 

Our empirical study builds on our earlier pilot study (Vattam
et al. 2014), where we tested SET-PR at varying input error
levels but did not compare it to a baseline or compare
different variants of SET-PR. In this study, we
investigated the following hypotheses:

H1: SET-PR's action-sequence graph representation for
plans increases recognition performance in the
presence of input errors.

H2:  Including  state  information  in  input  action  sequences 
and  plans  improves  error  tolerance. 

H3:  Combining  structural  and  semantic  similarity 
outperforms  using  either  in  isolation. 

4.1 Empirical Method

We compared the performance of a baseline algorithm with
three versions of SET-PR, all using J for graph matching.
(We consider the other similarity metrics in §5.)

•  Baseline:  Inputs  and  plans  contained  (action)  sequences 
(no  state  information),  treated  as  symbols  (not  converted 
to  a  graph  representation);  matching  was  computed 
using  edit  distance  (no  graph  matching). 

• SET-PR[A,0.5]: Inputs and plans contained (action)
sequences (no state information), represented as
action-sequence graphs; α = 0.5 (equal weights for structural
and semantic similarity).

• SET-PR[AS,0.5]: Inputs and plans contained (action,
state) sequences, represented as action-sequence graphs;
α = 0.5.

• SET-PR[AS,0.33]: This is a variant in which α = 0.33
(slightly lower weight for structural similarity).
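
To make the Baseline condition concrete, the sketch below computes an edit distance between two action sequences treated as sequences of symbols. The paper does not state which edit distance variant was used; unit-cost Levenshtein distance is assumed here, and the action names in the usage example are illustrative only.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Unit-cost Levenshtein distance between two symbol sequences."""
    # prev[j] holds the distance between the first i-1 symbols of a
    # and the first j symbols of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


observed = ["unstack", "pickup"]
stored = ["unstack", "putdown", "pickup"]
distance = edit_distance(observed, stored)
```

Under this assumption, the stored plan with the smallest distance to the observed sequence would be returned as the recognized plan.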

We  conducted  our  experiments  in  the  paradigmatic 
blocks  world  domain  because  it  is  simple  and  permits  the 
quick  automatic  generation  of  a  plan  library  with  the  desired 
characteristics. We used the hierarchical task network (HTN)
planner  SHOP2  (Nau  et  al.  2003)  to  generate  plans  for  our 
library.  Planning  problems  were  created  by  randomly 
selecting  initial  and  goal  states  (ensuring  that  the  goal  can 
be  reached  from  the  initial  state),  and  given  as  input  to 
SHOP2.  The  number  of  blocks  used  to  generate  the  plans 
ranged  from  9  to  12.  We  used  this  method  to  generate  100 
plans  for  our  library.  The  average  plan  length  was  12.48.  In 
the  baseline  condition,  we  stored  the  generated  plan  along 
with  the  goal  as  a  case  in  the  case  base.  In  the  non-baseline 
conditions, the generated plan was converted into an
action-sequence graph (using actions only in SET-PR[A,0.5], and
using actions and states in SET-PR[AS,0.5] and SET-PR[AS,0.33]),
and stored along with the goal as a case in the case base.

We used the following plan recognition metrics (Blaylock
and Allen 2005): (1) precision, (2) convergence rate, and (3)
convergence point. To understand these metrics, consider a
plan recognition session in which the recognizer is given $x$
input actions, which are streamed sequentially. After
observing each action, the recognizer uses the available
action sequence to query and predict a plan. The first query
will be $(a_1)$, the second $(a_1, a_2)$, and so on until
$(a_1, a_2, \ldots, a_x)$ (in SET-PR[AS] these will be action-state
sequences). Each session consists of $x$ queries and
predictions. Precision reports the number of correct
predictions divided by total predictions for a single session.²
Convergence is true if the correct plan is predicted by the
end of the session and false otherwise. If a correct
prediction is followed by an incorrect prediction at any point
in the observation session, the convergence flag will be reset
to false. Convergence rate is the percentage of sessions that
converged to true. If a session converges, convergence point
reports the number of actions after which the session
converged to the correct plan divided by the total number of
actions. Convergence point is averaged only for those
sessions that converge. Lower values for convergence point
indicate better performance, whereas higher values for
convergence rate and precision indicate better performance.

¹ The degree of a vertex v of a graph is the number of edges that touch v.
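
The per-session metrics can be sketched as follows, given one Boolean per query that records whether the prediction was correct. The function name is hypothetical, and the convergence point is computed here as the number of actions observed at the start of the final unbroken run of correct predictions, which is one reading of the definition above.

```python
from typing import List, Optional, Tuple


def session_metrics(correct: List[bool]) -> Tuple[float, bool, Optional[float]]:
    """Return (precision, converged, convergence_point) for one session."""
    if not correct:
        raise ValueError("a session needs at least one prediction")
    total = len(correct)
    precision = sum(correct) / total

    # The flag is reset by any later incorrect prediction, so the session
    # converges only if the final prediction is correct.
    converged = correct[-1]
    if not converged:
        return precision, False, None

    # Walk back to the start of the final run of correct predictions.
    start = total
    while start > 0 and correct[start - 1]:
        start -= 1
    # start is a 0-based query index; start + 1 actions had been observed.
    return precision, True, (start + 1) / total


result = session_metrics([False, True, False, True, True])
```

Convergence rate is then the fraction of sessions whose second component is true, and the average convergence point is taken over those sessions only.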

We evaluated the plan recognition metrics using the
leave-one-in (Aha and Breslow 1997) testing strategy as
follows. For each randomly selected case $c = (\pi_0, g_0) \in C$,
we copied plan $\pi_0$ and randomly distorted its action-sequence
$\langle (null, s_0), (a_1, s_1), \ldots, (a_g, s_g) \rangle$ to introduce a fixed and
equal amount of mislabeled and missing error (for
mislabeled, a specified percentage of actions in $\pi_0$ was
randomly chosen, and each was replaced with another action
randomly chosen from the domain; for missing, a specified
percentage of actions was randomly chosen, and each was
replaced with an unidentified marker '*'). This distorted
plan was used as an incremental query to SET-PR (i.e.,
initially with only its first (action, state) pair, and then
repeatedly adding the next such pair in its sequence). The
error levels tested were {10%, 20%, 30%, ..., 90%}.
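
The distortion procedure can be sketched as below for an actions-only sequence. Several details are assumptions made for illustration: the stated error level is split evenly between the two error types, the two types are applied to disjoint positions, and the marker for a missing action is the string `"*"`. The function and parameter names are hypothetical.

```python
import random
from typing import List, Sequence

MISSING = "*"  # assumed marker for an unidentified (missing) action


def distort(actions: Sequence[str], domain_actions: Sequence[str],
            error_rate: float, rng: random.Random) -> List[str]:
    """Return a copy of actions with mislabeled and missing errors added."""
    n = len(actions)
    n_each = int(n * error_rate / 2)          # equal share per error type
    chosen = rng.sample(range(n), 2 * n_each)  # disjoint positions
    distorted = list(actions)
    for idx in chosen[:n_each]:
        # Mislabeled: replace with a different action from the domain.
        alternatives = [a for a in domain_actions if a != actions[idx]]
        distorted[idx] = rng.choice(alternatives)
    for idx in chosen[n_each:]:
        # Missing: replace with the unidentified marker.
        distorted[idx] = MISSING
    return distorted


plan = [f"a{i}" for i in range(10)]
noisy = distort(plan, plan + ["b"], error_rate=0.4, rng=random.Random(0))
```

The distorted sequence would then be fed to the recognizer one prefix at a time, as described above.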

4.2 Results

Figure 1 plots performance versus error levels across three
metrics. For convergence rate, SET-PR[AS,0.5] and
SET-PR[AS,0.33] outperformed Baseline and SET-PR[A,0.5].
Baseline's convergence rate fell sharply between the 10% and 20%
error levels, while SET-PR[A,0.5]'s degradation was more
gradual, though it reached low levels at 40%. SET-PR[AS,0.5]
and SET-PR[AS,0.33] maintained a
convergence rate of 35% to 50% even at higher error rates.

For average precision, in the absence of any error,
Baseline's precision was higher than that of the SET-PR variants.
This is because the approximate graph matching technique
used by SET-PR can assign the same score to multiple plans
with minor differences, in which case a random plan was
selected. This can reduce average precision. With greater
plan diversity in the library, we conjecture the performance
of SET-PR will be similar to that of the Baseline in the zero
error case. However, in the presence of error Baseline's
precision fell sharply. Again, SET-PR[AS,0.5] and
SET-PR[AS,0.33] performed best for higher error levels.

Figure 1: Performance of Baseline and three variations of
SET-PR with a varying input error rate using three metrics
(x-axis: % input error).

For average convergence point, Baseline recorded the
lowest values, but this does not indicate superior
performance because its convergence rate is low even in the
10% input error condition (the converged set had too few
data points). Similarly, the convergence point for
SET-PR[A,0.5] does not afford meaningful comparison beyond
the 20% error rate. Only SET-PR[AS,0.5] and
SET-PR[AS,0.33] can be meaningfully compared in this test, and
SET-PR[AS,0.33] performed better at all error rates.

For convergence rate and precision, Baseline performed
comparatively poorly at all non-zero error levels,
particularly relative to the SET-PR variants that use
action-state sequence representations, lending some support to H1.
These results also lend support to H2; actions-only
SET-PR[A,0.5] was outperformed by the other two variants with
state information on all three metrics at all non-zero error
levels (with the exception of convergence point at high error
levels characterized by a low convergence rate).

² This should not be confused with typical precision/recall definitions
involving false positives and false negatives.

Figure 2: Performance of SET-PR[AS], varying from purely
semantic (α = 0) to purely structural (α = 1) similarity
(x-axis: % input error).

Figure 2 displays the performance of SET-PR[AS] with
levels of α ranging from 0.0 (purely semantic similarity) to
1.0 (purely structural) in four increments. This shows that
using only semantic similarity performs well on two metrics
but poorly on a third, while purely structural similarity
performs poorly for two of the metrics. The best overall
performance was attained when α = 0.33, taking all three
metrics into account. This lends support to H3.

5. Discussion

Our results for H2 suggest that plan representations rich in
state information, such as SET-PR's action-state sequences,
enable more informed predictions because states capture the
context of actions. Plan recognition techniques that rely
solely on actions exhibit brittleness even when a small
proportion of input actions are mislabeled or missing. Our
results for H3 suggest that SET-PR is sensitive to α
(structural vs. semantic similarity), but the best observed
value of α = 0.33 could be domain specific. In future work
with other domains, we will assess the extent to which
SET-PR is sensitive to α.

Figure 3: Convergence rate of SET-PR[AS,0.33] with different
degree sequence similarity metrics, given (a) both mislabeled
and missing actions, and (b) only missing actions.

Our study in §4 may be influenced by several factors (e.g.,
the similarity metric that was used). In an exploratory
analysis, we examined whether J was an appropriate choice
by testing all four similarity metrics using
SET-PR[AS,0.33]. As Figure 3(a) shows for convergence rate,
while performance deteriorates with increasing levels of
input error, J performed on par with W, and outperformed S
substantially. Similarly, we found that J, W, and B
outperformed S on precision, while S performed well on
convergence point, though this does not indicate
superior performance because its convergence rate was low
beyond the 20% error rate (i.e., too few data points in the
converged set to derive a trend). This suggests that, for these
studies, J is an appropriate choice.

However, a factorial study of other design choices would
reveal a more complicated story. For example, Figure 3(b)
plots the convergence rate for the same algorithms but with
input errors containing only one type of error (missing
actions). In contrast to using the other similarity metrics, the
convergence rate of SET-PR[AS,0.33] for S does not
deteriorate with higher error levels. Our conjecture is that S
is more sensitive to the size of the MCS, and theoretical
analyses may reveal that the MCS (as a percentage of graph
size) of two randomly-sampled graphs, for higher error
rates, is much higher when the errors are constrained to
missing actions. We will test for this in future work, and
whether this behavior is limited to our current domain and
plan libraries.

6. Summary and Future Work

We described SET-PR, a case-based approach to the
problem of plan recognition that can tolerate mislabeled and
missing actions in the input action sequences. We
highlighted SET-PR's case representation (action-sequence
graphs) and SET-PR's similarity function, which combines
degree sequence similarity and semantic similarity for
matching action-sequence graphs. We described an
empirical study where we found evidence to support our
hypotheses that SET-PR's action-sequence graph
representation for plans and the inclusion of state
information in these representations increase plan
recognition performance in the presence of input errors. We
also found that combining structural and semantic similarity
outperforms using either in isolation.

For  future  work,  we  will  conduct  a  factorial  study  of  our 
design  choices,  with  the  objective  of  explaining  some  of  the 
trends  that  we  observed  (e.g.,  why  SET-PR[AS,0.33]  with 
Simpson’s  similarity  metric  maintained  a  high  convergence 
rate even at higher levels of errors). We will also integrate
and test SET-PR in other domains, including a human-robot
teaming domain. We will also compare SET-PR's
performance with that of other state-of-the-art plan
recognizers in the presence of input errors. Finally, we will
examine  more  sophisticated  graph  similarity  metrics  (e.g., 
graph  kernels)  and  compare  them  against  the  simple  degree 
sequence  metrics  that  SET-PR  currently  uses. 